Nasarawa State
Tensor Network Based Feature Learning Model
Saiapin, Albert, Batselier, Kim
Many approximations have been proposed to circumvent the cubic complexity of kernel-based algorithms, allowing their application to large-scale datasets. One strategy is to consider the primal formulation of the learning problem by mapping the data to a higher-dimensional space using tensor-product structured polynomial and Fourier features. The curse of dimensionality caused by these tensor-product features has been effectively addressed by a tensor network reparameterization of the model parameters. However, another important aspect of model training, identifying optimal feature hyperparameters, has not been addressed and is typically handled with standard cross-validation. In this paper, we introduce the Feature Learning (FL) model, which addresses this issue by representing tensor-product features as a learnable Canonical Polyadic Decomposition (CPD). By leveraging this CPD structure, we efficiently learn the hyperparameters associated with different features alongside the model parameters using an Alternating Least Squares (ALS) optimization method. We demonstrate the effectiveness of the FL model through experiments on real data of various dimensionality and scale. The results show that the FL model can be consistently trained 3-5 times faster than a standard cross-validated model while achieving prediction quality on par with it.
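As a rough illustration of why a low-rank parameterization sidesteps the curse of dimensionality mentioned in the abstract above, the sketch below evaluates a model with tensor-product polynomial features in two ways: through rank-R CPD factors, and through the explicit weight tensor. It is not the FL model from the paper. The polynomial feature map, the factor values, and all function names are assumptions made for illustration, and the CPD is applied here to the model weights only.

```python
import itertools
import math


def poly_features(x, n_basis):
    """Per-dimension polynomial feature map: [1, x, x^2, ..., x^(n_basis-1)]."""
    return [x ** k for k in range(n_basis)]


def cpd_predict(x, factors, n_basis):
    """Evaluate f(x) = sum_r prod_d <factors[d][r], phi(x_d)>.

    factors[d][r] is the r-th weight vector (length n_basis) for input
    dimension d. The full n_basis**D weight tensor is never formed.
    """
    rank = len(factors[0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for d, x_d in enumerate(x):
            phi = poly_features(x_d, n_basis)
            term *= sum(w * p for w, p in zip(factors[d][r], phi))
        total += term
    return total


def full_tensor_predict(x, factors, n_basis):
    """Same prediction via the explicit weight tensor (for checking only)."""
    rank = len(factors[0])
    dims = len(x)
    phis = [poly_features(x_d, n_basis) for x_d in x]
    total = 0.0
    # Loop over every entry of the n_basis**dims weight tensor.
    for idx in itertools.product(range(n_basis), repeat=dims):
        weight = sum(
            math.prod(factors[d][r][idx[d]] for d in range(dims))
            for r in range(rank)
        )
        feature = math.prod(phis[d][idx[d]] for d in range(dims))
        total += weight * feature
    return total


# Hypothetical rank-2 factors for a 3-dimensional input, 3 basis functions each.
factors = [
    [[1.0, 0.5, -0.2], [0.3, -1.0, 0.4]],  # dimension 0, terms r = 0, 1
    [[0.2, 0.1, 0.7], [-0.5, 0.6, 0.0]],   # dimension 1
    [[1.5, -0.3, 0.2], [0.0, 0.8, -0.1]],  # dimension 2
]
x = [0.5, -1.0, 2.0]
fast = cpd_predict(x, factors, n_basis=3)
slow = full_tensor_predict(x, factors, n_basis=3)
print(abs(fast - slow) < 1e-9)
```

The factored evaluation costs on the order of R * D * n_basis operations, while the explicit tensor has n_basis**D entries, which is why the factored form stays tractable as the input dimension grows.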
- Africa > Senegal > Kolda Region > Kolda (0.05)
- Europe > Netherlands > South Holland > Delft (0.05)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
- Workflow (0.67)
- Overview (0.67)
- Research Report > New Finding (0.45)
- Information Technology (1.00)
- Health & Medicine (1.00)
- Energy (1.00)
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (0.93)
- Law (0.67)
- Energy (0.46)
- Information Technology (0.46)
- Health & Medicine (0.46)
Neighborhood Sampling Does Not Learn the Same Graph Neural Network
Niu, Zehao, Anitescu, Mihai, Chen, Jie
Neighborhood sampling is an important ingredient in the training of large-scale graph neural networks. It suppresses the exponential growth of the neighborhood size across network layers and maintains feasible memory consumption and time costs. While it has become a standard implementation in practice, its systemic behaviors are less well understood. We conduct a theoretical analysis by using the tool of neural tangent kernels, which characterize the (analogous) training dynamics of neural networks based on their infinitely wide counterparts -- Gaussian processes (GPs). We study several established neighborhood sampling approaches and the corresponding posterior GP. With limited samples, the posteriors are all different, although they converge to the same one as the sample size increases. Moreover, the posterior covariance, which lower-bounds the mean squared prediction error, is incomparable, aligning with observations that no sampling approach dominates.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Africa > Nigeria > Nasarawa State > Lafia (0.04)
Evaluating the Performance of Nigerian Lecturers using Multilayer Perceptron
Ezeibe, I. E., Okide, S. O., Asogwa, D. C.
Evaluating lecturer performance is essential for enhancing teaching quality, improving student learning outcomes, and strengthening an institution's reputation. Without a dedicated evaluation system, lecturer performance evaluation has been neither comprehensive nor holistic. The system presented here was built on a web-based platform with a secure database and used a custom dataset to capture performance metrics including student evaluation scores, research publications, years of experience, and administrative duties. A Multilayer Perceptron (MLP) was used because of its ability to process complex data patterns and generate accurate predictions of a lecturer's performance from historical data. The research focused on designing multiple performance metrics beyond the standard ones, incorporating student participation, and integrating analytical tools to deliver a comprehensive and holistic evaluation of lecturers' performance; the system was developed using the Object-Oriented Analysis and Design (OOAD) methodology. The model evaluates lecturers' performance with an accuracy of about 91% compared with actual performance. Finally, from the evaluation of the MLP model, it is concluded that the MLP enhanced lecturer performance evaluation by providing accurate predictions, reducing bias, and supporting data-driven decisions, ultimately improving the fairness and efficiency of the evaluation process. The MLP model's performance was evaluated using Mean Squared Error (MSE) and Mean Absolute Error (MAE); it achieved a test loss (MSE) of 256.99 and an MAE of 13.76, reflecting a high level of prediction accuracy. The model also demonstrated an estimated accuracy rate of approximately 96%, validating its effectiveness in predicting lecturer performance.
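For readers unfamiliar with the two error metrics named in the abstract above, the following minimal sketch shows how MSE and MAE are computed. The scores are hypothetical placeholders, not data from the paper, and the function names are assumptions. Because MSE is in squared units, its square root (RMSE) is the quantity comparable to MAE; the reported MSE of 256.99 corresponds to an RMSE of roughly 16.03.

```python
import math


def mse(y_true, y_pred):
    """Mean Squared Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical performance scores, for illustration only.
y_true = [70.0, 85.0, 60.0, 90.0]
y_pred = [65.0, 88.0, 72.0, 86.0]

print(mse(y_true, y_pred))             # 48.5
print(mae(y_true, y_pred))             # 6.0
print(math.sqrt(mse(y_true, y_pred)))  # RMSE, in the same units as the scores
```

RMSE is always at least as large as MAE, and a wide gap between the two suggests that a few large errors dominate.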
- Research Report (0.66)
- Instructional Material > Course Syllabus & Notes (0.48)
Multimodal LLMs for OCR, OCR Post-Correction, and Named Entity Recognition in Historical Documents
Greif, Gavin, Griesshaber, Niclas, Greif, Robin
We explore how multimodal Large Language Models (mLLMs) can help researchers transcribe historical documents, extract relevant historical information, and construct datasets from historical sources. Specifically, we investigate the capabilities of mLLMs in performing (1) Optical Character Recognition (OCR), (2) OCR Post-Correction, and (3) Named Entity Recognition (NER) tasks on a set of city directories published in German between 1754 and 1870. First, we benchmark the off-the-shelf transcription accuracy of both mLLMs and conventional OCR models. We find that the best-performing mLLM model significantly outperforms conventional state-of-the-art OCR models and other frontier mLLMs. Second, we are the first to introduce multimodal post-correction of OCR output using mLLMs. We find that this novel approach leads to a drastic improvement in transcription accuracy and consistently produces highly accurate transcriptions (<1% CER), without any image pre-processing or model fine-tuning. Third, we demonstrate that mLLMs can efficiently recognize entities in transcriptions of historical documents and parse them into structured dataset formats. Our findings provide early evidence for the long-term potential of mLLMs to introduce a paradigm shift in the approaches to historical data collection and document transcription.
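The character error rate (CER) quoted in the abstract above is conventionally the character-level edit distance between a transcription and its ground truth, divided by the ground-truth length. The sketch below follows that common definition; the paper's exact normalization and preprocessing are not stated here, so treat the details as assumptions. The directory entry is a made-up example.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical directory entry: one wrong character out of 14.
reference = "Müller, Johann"
hypothesis = "Muller, Johann"
print(round(cer(reference, hypothesis), 4))  # 0.0714
```

Under this definition, a CER below 1% means fewer than one character-level edit per 100 reference characters.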
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Germany > Saxony > Leipzig (0.05)
- Europe > Latvia > Riga Municipality > Riga (0.04)
A Gap Between the Gaussian RKHS and Neural Networks: An Infinite-Center Asymptotic Analysis
Kumar, Akash, Parhi, Rahul, Belkin, Mikhail
Recent works have characterized the function-space inductive bias of infinite-width bounded-norm single-hidden-layer neural networks as a kind of bounded-variation-type space. This novel neural network Banach space encompasses many classical multivariate function spaces including certain Sobolev spaces and the spectral Barron spaces. Notably, this Banach space also includes functions that exhibit less classical regularity such as those that only vary in a few directions. On bounded domains, it is well-established that the Gaussian reproducing kernel Hilbert space (RKHS) strictly embeds into this Banach space, demonstrating a clear gap between the Gaussian RKHS and the neural network Banach space. It turns out that when investigating these spaces on unbounded domains, e.g., all of $\mathbb{R}^d$, the story is fundamentally different. We establish the following fundamental result: Certain functions that lie in the Gaussian RKHS have infinite norm in the neural network Banach space. This provides a nontrivial gap between kernel methods and neural networks by the exhibition of functions in which kernel methods can do strictly better than neural networks.
- North America > United States > California > San Diego County > San Diego (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- Africa > Nigeria > Nasarawa State > Lafia (0.04)
AI in Archival Science -- A Systematic Review
Shinde, Gaurav, Kirstein, Tiana, Ghosh, Souvick, Franks, Patricia C.
The rapid expansion of records creates significant challenges in management, including retention and disposition, appraisal, and organization. Our study underscores the benefits of integrating artificial intelligence (AI) within the broad realm of archival science. In this work, we start by performing a thorough analysis to understand the current use of AI in this area and identify the techniques employed to address challenges. Subsequently, we document the results of our review according to specific criteria. Our findings highlight key AI-driven strategies that promise to streamline record-keeping processes and enhance data retrieval efficiency. We also demonstrate our review process to ensure transparency regarding our methodology. Furthermore, this review not only outlines the current state of AI in archival science and records management but also lays the groundwork for integrating new techniques to transform archival practices. Our research emphasizes the necessity for enhanced collaboration between the disciplines of artificial intelligence and archival science.
- North America > United States > Oklahoma > Payne County > Cushing (0.05)
- Africa > Nigeria > Nasarawa State > Lafia (0.05)
- Europe > Sweden (0.04)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Information Technology > Security & Privacy (0.68)
- Law (0.68)
- Education (0.67)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities
Hua, Wenyue, Zhu, Kaijie, Li, Lingyao, Fan, Lizhou, Lin, Shuhang, Jin, Mingyu, Xue, Haochen, Li, Zelong, Wang, JinDong, Zhang, Yongfeng
This study intends to systematically disentangle pure logic reasoning and text understanding by investigating the contrast across abstract and contextualized logical problems from a comprehensive set of domains. We explore whether LLMs demonstrate genuine reasoning capabilities across various domains when the underlying logical structure remains constant. We focus on two main questions: (1) Can abstract logical problems alone accurately benchmark an LLM's reasoning ability in real-world scenarios, disentangled from contextual support in practical settings? (2) Does fine-tuning LLMs on abstract logic problems generalize to contextualized logic problems and vice versa? To investigate these questions, we focus on standard propositional logic, specifically propositional deductive and abductive logic reasoning. In particular, we construct instantiated datasets for deductive and abductive reasoning with 4 levels of difficulty, encompassing 12 distinct categories or domains based on the categorization of Wikipedia. Our experiments aim to provide insights into disentangling context in logical reasoning and the true reasoning capabilities of LLMs and their generalization potential. The code and dataset are available at: https://github.com/agiresearch/ContextHub.
- Asia > South Korea (0.04)
- North America > United States > Michigan (0.04)
- Africa > Nigeria > Nasarawa State > Lafia (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Leisure & Entertainment > Sports (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (0.69)